1. money_transacted has the highest correlation coefficient with respect to the attribute is_fraud. 2. partner_pricing_category and user_id seem to be slightly correlated. 3. All the other attributes show a very weak correlation.
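A ranking like the one above can be reproduced with a correlation matrix. The sketch below is a minimal illustration on a made-up toy frame; the column names follow the notes, but the values are invented and the real analysis may have used a different correlation method.

```python
import pandas as pd

# Toy stand-in for the transactions table; all values are invented for illustration.
df = pd.DataFrame({
    "money_transacted": [100, 250, 40, 60000, 75000, 120, 90000, 30],
    "partner_pricing_category": [2, 2, 1, 1, 1, 2, 1, 2],
    "user_id": [11, 12, 13, 14, 15, 16, 17, 18],
    "is_fraud": [0, 0, 0, 1, 1, 0, 1, 0],
})

# Pearson correlation of every column with the target, ranked by absolute strength.
corr_with_target = (
    df.corr()["is_fraud"]
    .drop("is_fraud")
    .abs()
    .sort_values(ascending=False)
)
print(corr_with_target)
```

Taking the absolute value keeps strongly negative correlations from being ranked as weak ones.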
1. It doesn't bring out a clear pattern. Users who have committed fraud are spread throughout. 2. The higher the value of money_transacted, the more frauds are committed.
1. The highest number of transactions were made through e_wallet_payments and sbi_atm_cum_debit_card. 2. The highest-value transactions were made through other_debit_cards. 3. All payments made by the merchant (negative transactions) go through unified_payments_interface and sbi_atm_cum_debit_card.
1. unified_payment_services is the only method with no fraud; however, it also has the fewest transactions and remains the least used. 2. For this method, the value of money_transacted also only goes up to a few hundred rupees. 3. Beyond 25k, only visa_master_debit cards seem to have a few transactions that aren't fraud, and even then only when the amount is under 50k.
There are 4 partners (47334, 23667, 78890, 118335) that account for most of the transactions.
Partner 47334: handles the highest number of transactions, but in the lower range. Partner 118335: is the only partner that handles transactions above 50k.
1. Partners such as 39445 (19), 71001 (15), and 173558 (89), which have fewer than 100 transactions and yet result in fraudulent payments, should likely be discontinued. 2. Partners resulting in fraudulent payments: 23667 (19526), 39445 (19), 47334 (26105), 71001 (15), 7889 (2317), 102557 (231), 118335 (9546), 165669 (1216), 173558 (89).
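One way to arrive at a list of low-volume partners that still produce fraud is a per-partner aggregation. The sketch below assumes columns named partner_id and is_fraud (0/1) and uses invented toy data; the cutoff constant is hypothetical and is scaled down from the 100 mentioned in the notes so that the toy frame exercises it.

```python
import pandas as pd

# Toy data: invented partner ids and fraud labels, for illustration only.
df = pd.DataFrame({
    "partner_id": [1, 1, 1, 2, 2, 3, 3, 3, 3, 3],
    "is_fraud":   [0, 1, 0, 0, 0, 1, 1, 0, 0, 0],
})

# Transaction count and fraud count per partner.
summary = df.groupby("partner_id")["is_fraud"].agg(transactions="count", frauds="sum")

# Hypothetical cutoff for the toy data; the notes use 100 on the real table.
MAX_TRANSACTIONS = 4

# Partners with few transactions that still have at least one fraud.
flagged = summary[(summary["transactions"] < MAX_TRANSACTIONS) & (summary["frauds"] > 0)]
print(flagged)
```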
Every transaction above 50k is handled by category 1.
1. Only categories 1, 2, 3, and 8 are the ones where frauds have been committed. 2. Category 2 is where there were frauds while a payment was being credited. 3. Category 1 is where there were frauds while a payment greater than 50k was being debited.
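A cross-tabulation is one simple way to see which categories contain fraud at all. In this sketch the column name partner_category is an assumption (the notes only say "category"), and the rows are invented so that the toy result mirrors the observation above.

```python
import pandas as pd

# Toy data; "partner_category" is an assumed column name and the rows are invented.
df = pd.DataFrame({
    "partner_category": [1, 1, 2, 2, 3, 4, 8, 4],
    "is_fraud":         [1, 0, 1, 0, 1, 0, 1, 0],
})

# Rows are categories, columns are the is_fraud labels 0 and 1.
fraud_by_category = pd.crosstab(df["partner_category"], df["is_fraud"])

# Categories that have at least one fraudulent transaction.
categories_with_fraud = fraud_by_category.index[fraud_by_category[1] > 0].tolist()
print(fraud_by_category)
print(categories_with_fraud)
```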
This attribute has zero variance and therefore isn't useful to us, so we drop it.
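Dropping a zero-variance attribute can be sketched generically as below. The column name constant_col is hypothetical, since the note does not name the attribute, and the data is invented.

```python
import pandas as pd

# Toy data; "constant_col" is a hypothetical name for the zero-variance attribute.
df = pd.DataFrame({
    "constant_col": ["same"] * 4,
    "money_transacted": [10, 20, 30, 40],
    "is_fraud": [0, 0, 1, 0],
})

# A column with a single distinct value carries no information for a model.
constant_cols = [c for c in df.columns if df[c].nunique(dropna=False) <= 1]
df = df.drop(columns=constant_cols)
print(constant_cols, list(df.columns))
```

Counting distinct values rather than calling a variance function lets the same check work for non-numeric columns.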
Customers mostly use other_pcs and android devices to make payments.
1. Payments credited by ios_devices were successful. However, they are in the lower range and fewer in number, so we can't draw reliable conclusions.
As seen already, values above 50k are outliers and always result in fraudulent transactions, so the company needs to find a permanent fix for this. Apart from that, the transactions on the lower scale are the real problem, as there is no clear pattern.
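The 50k observation can be checked by comparing the fraud rate on either side of the threshold. This is a minimal sketch on invented data; the threshold constant comes from the notes, everything else is assumed.

```python
import pandas as pd

# Toy data, invented for illustration.
df = pd.DataFrame({
    "money_transacted": [120, 800, 52000, 75000, 3000, 99000, 45],
    "is_fraud":         [0, 1, 1, 1, 0, 1, 0],
})

THRESHOLD = 50_000  # the 50k cutoff mentioned in the notes

# Boolean mask of transactions above the cutoff.
above = df["money_transacted"] > THRESHOLD

# Share of fraudulent transactions above and at-or-below the cutoff.
rate_above = df.loc[above, "is_fraud"].mean()
rate_below = df.loc[~above, "is_fraud"].mean()
print(rate_above, rate_below)
```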
There is no particular pattern as such; frauds have happened throughout transaction_initiation.
Majority of the partners belong to category 2.
Selecting the features that contribute most to the model's predictions, ranked by feature_importances_.
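The notes do not say which estimator supplies feature_importances_. The sketch below assumes a scikit-learn RandomForestClassifier and synthetic data, in which the label depends only on money_transacted and the two noise columns are hypothetical fillers.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: the label depends only on money_transacted; noise_* are fillers.
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "money_transacted": rng.uniform(0, 100_000, n),
    "noise_a": rng.normal(size=n),
    "noise_b": rng.normal(size=n),
})
y = (X["money_transacted"] > 50_000).astype(int)

# Assumed estimator; any fitted tree ensemble exposing feature_importances_ would do.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X, y)

# Impurity-based importances, one per feature, ranked from most to least important.
importances = pd.Series(model.feature_importances_, index=X.columns).sort_values(ascending=False)
top_features = importances.head(2).index.tolist()
print(importances)
```

These importances are impurity-based, so they rank features by how much they help the trees split, not by how much variance they explain.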